Scalable Source Code Similarity Detection in Large Code Repositories



Similar resources

Efficient plagiarism detection for large code repositories

Unauthorized re-use of code by students is a widespread problem in academic institutions, and raises liability issues for industry. Manual plagiarism detection is time-consuming, and current effective plagiarism detection approaches cannot be easily scaled to very large code repositories. While there are practical text-based plagiarism detection systems capable of working with large collections...


Analysis of Source Code Repositories

Source code repositories are designed to store a huge amount of source code. They also indirectly collect information useful for analyzing the development process. Usually, this latter set of data is not used at all due to the lack of specialized tools to collect and analyze it. This paper presents the early stages of a tool designed to perform acquisition and analysis of data stored in source...


Source Code Repositories and Agile Methods

Source repositories are a promising database of information about software projects. This paper proposes a tool to extract and summarize information from CVS logs in order to identify whether there are differences in the development approach of Agile and non-Agile teams. The tool aims to improve empirical investigation of the Agile Methods (AMs) without affecting the way developers write code. ...


Efficient and Effective Plagiarism Detection for Large Code Repositories

The copying of programming assignments is a widespread problem in academic institutions. Manual plagiarism detection is time-consuming, and current popular plagiarism detection systems are not scalable to large code repositories. While there are text-based plagiarism detection systems capable of handling millions of student papers, comparable systems for code-based plagiarism detection...


Model-Based Mining of Source Code Repositories

The Mining Software Repositories (MSR) field analyzes the rich data available in source code repositories (SCR) to uncover interesting and actionable information about software system evolution. Major obstacles in MSR are the heterogeneity of software projects and the amount of data that is processed. Model-driven software engineering (MDSE) can deal with heterogeneity by abstraction as its cor...



Journal

Journal title: ICST Transactions on Scalable Information Systems

Year: 2019

ISSN: 2032-9407

DOI: 10.4108/eai.13-7-2018.159353